What is claimed is:

1. A method of profiling the genomic regulatory regions of a biological
sample, comprising: 

(1) contacting a sample of nucleic acid from a biological sample, with a
positionally addressable array of polynucleotides under conditions
such that hybridization can occur, said sample of nucleic acid being
enriched in ACEs or fragments thereof of at least 10 base pairs;
and 

(2) detecting loci on the array where hybridization occurs,
wherein said ACEs are each a nucleotide sequence characterized as 

being hypersensitive to a DNA modifying agent relative to a nearby region when present 
in chromatin isolated from one or more cells, has a size in the range of 80-250 base 
pairs, and is bound by one or more sequence-specific DNA binding factors when 
present in chromatin isolated from one or more cells, 

and wherein said array of polynucleotides comprises a plurality of 
polynucleotides, each affixed to a substrate, said plurality comprising different 
polynucleotides differing in nucleotide sequence and being situated at distinct loci of the 
array, said different polynucleotides being complementary and hybridizable to genomic 
DNA of said biological sample, 

thereby profiling the genomic regulatory regions of the biological sample. 

2. The method of claim 1, wherein said plurality of polynucleotides is at least
500 different polynucleotides, at least 1,000 different polynucleotides, at least 5,000
different polynucleotides, at least 10,000 different polynucleotides, or at least 20,000 
different polynucleotides. 

3. The method of claim 1, wherein each said ACE is further characterized as
having one or more of the following characteristics:

(1) an intrinsic ability to confer hypersensitivity to the DNA
modifying agent when excised from its native location and 
inserted into at least one different location in the genome of 
a cell of the same cell type; 

(2) a greater hypersensitivity to the DNA modifying agent 
relative to the nearby region, wherein said hypersensitivity is 
10-50 times greater hypersensitivity, 50-100 times greater 
hypersensitivity, 100-150 times greater hypersensitivity or 
150-200 times greater hypersensitivity to the DNA modifying 
agent relative to the nearby region; 

(3) the ability to reconstitute a site that is hypersensitive to the 
DNA modifying agent when a nucleic acid comprising the 
nucleotide sequence flanked by at least 1000 bp on each 
side is assembled into chromatin in an in vitro reconstitution 
assay in the presence of nucleosomal proteins and a cell
extract; 

(4) is non-nucleosomal when present in chromatin isolated from
one or more cells; 

(5) is embedded in DNA associated with histones that have a 
high degree of acetylation when present in chromatin 
isolated from one or more cells; 

(6) greater solubility than nucleosomal material in moderate salt 
solutions (e.g., 150 mM NaCl and 3 mM MgCl2) when present
in chromatin isolated from one or more cells; 

(7) is a non-coding sequence; or 

(8) does not occur greater than 10 times in a genome of the 
organism in which the ACE is identified.

4. A positionally addressable polynucleotide array comprising a plurality of
different polynucleotides, each different polynucleotide (a) differing in nucleotide
sequence, (b) being affixed to a substrate at a different locus, (c) being in the range of
10-1000 nucleotides in length, and (d) being complementary and hybridizable to a
predetermined ACE, each said ACE being a nucleotide sequence characterized as
being hypersensitive to a DNA modifying agent relative to a nearby region when present
in chromatin isolated from one or more cells, has a size in the range of 80-250 base
pairs, and is bound by one or more sequence-specific DNA binding factors when 
present in chromatin isolated from one or more cells, and 

wherein the loci at which said different polynucleotides are situated are at 
least 15% of the total loci of the array.

5. The positionally addressable polynucleotide array of claim 4 in which each 
different polynucleotide is greater than 30 nucleotides and is designed so as not to 
contain a sequence in the range of 15-30 nucleotides that occurs in the genome of
the organism from which the ACEs are identified greater than 10 times. 

6. The positionally addressable polynucleotide array of claim 5, wherein 
each said different polynucleotide is designed by a method comprising 

(a) identifying by comparing to an indexed polynucleotide set a sequence in said
different polynucleotide, wherein said sequence consists of a nucleotide sequence in
the range of 10-15 nucleotides and has a frequency count less than 11 in the genome of
said organism, and wherein said indexed polynucleotide set contains binary encoded 
nucleotide sequences of sizes in the range of 10-15 nucleotides; 

(b) determining the genomic locations of said sequence from said indexed
polynucleotide set; 

(c) adding prefix and suffix nucleotide sequences to said sequence according to 
the genomic sequence at each of said genomic locations to generate a set of candidate 
polynucleotides; and 

(d) accepting a polynucleotide from said set of candidate polynucleotides if the 
respective alignment of the sequences of its added prefix and suffix sequences and the
prefix and suffix sequences of said sequence in the corresponding predetermined ACE
is above a given threshold.

WO 2004/052080    PCT/US2003/039645
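
Steps (a) and (b) of claim 6 can be illustrated with a minimal Python sketch. It assumes a 2-bit-per-base encoding (the claim says only "binary encoded") and works on a toy sequence; the names `encode_kmer`, `build_index` and `rare_kmers` are hypothetical and are not taken from the specification. Steps (c) and (d) are not covered.

```python
from collections import defaultdict

# Assumed 2-bit code per base; the claim does not fix the encoding.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_kmer(kmer):
    """Pack a k-mer into a single integer, 2 bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | CODE[base]
    return value


def build_index(genome, k):
    """Map each encoded k-mer to the list of its start positions in the genome."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        kmer = genome[pos:pos + k]
        if all(base in CODE for base in kmer):  # skip ambiguous bases such as N
            index[encode_kmer(kmer)].append(pos)
    return index


def rare_kmers(probe, index, k, max_count=10):
    """Return (k-mer, positions) pairs for probe k-mers with a frequency count below 11."""
    found = []
    for pos in range(len(probe) - k + 1):
        kmer = probe[pos:pos + k]
        if not all(base in CODE for base in kmer):
            continue
        hits = index.get(encode_kmer(kmer), [])
        if 0 < len(hits) <= max_count:
            found.append((kmer, hits))
    return found
```

The positions returned by `rare_kmers` are the genomic locations at which prefix and suffix sequence would then be added to build candidate polynucleotides.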

7. A positionally addressable polynucleotide array to which nucleic acids are
hybridized, said array comprising a plurality of different polynucleotides, each different
polynucleotide (a) differing in nucleotide sequence and (b) being affixed at a different
locus to a substrate, said nucleic acids being enriched in ACEs or fragments thereof of
at least 10 base pairs, each said ACE being a nucleotide sequence characterized as
being hypersensitive to a DNA modifying
agent relative to a nearby region when present in chromatin isolated from one or more
cells, has a size in the range of 80-250 base pairs, and is bound by one or more
sequence-specific DNA binding factors when present in chromatin isolated from one or
more cells, said nucleic acids being hybridized to one or more discrete loci on the array.

8. A positionally addressable polynucleotide array to which nucleic acids are
hybridized, said array comprising a plurality of different polynucleotides, each different
polynucleotide (a) differing in nucleotide sequence, (b) being affixed at a different locus
to a substrate, (c) being in the range of 10-1000 nucleotides in length, and (d) being
complementary and hybridizable to a predetermined ACE, each said ACE being a
nucleotide sequence characterized as
being hypersensitive to a DNA modifying agent relative to a nearby region when present
in chromatin isolated from one or more cells, has a size in the range of 80-250 base 
pairs, and is bound by one or more sequence-specific DNA binding factors when 
present in chromatin isolated from one or more cells, and 

wherein the loci at which said different polynucleotides are situated are at 
least 15% of the total loci of the array.

9. A positionally addressable polynucleotide array to which nucleic acids are 
hybridized, said array comprising a plurality of different polynucleotides, each different
polynucleotide (a) differing in nucleotide sequence, (b) being affixed at a different locus
to a substrate, (c) being in the range of 10-1000 nucleotides in length, and (d) being
complementary and hybridizable to a predetermined ACE, each said ACE being a
nucleotide sequence
characterized as being hypersensitive to a DNA modifying agent relative to a nearby
region when present in chromatin isolated from one or more cells, has a size in the 
range of 80-250 base pairs, and is bound by one or more sequence-specific DNA 
binding factors when present in chromatin isolated from one or more cells, 

wherein the loci at which said different polynucleotides are situated are at 
least 15% of the total loci of the array;

and wherein said nucleic acids are enriched in ACEs or fragments thereof 
of at least 10 base pairs. 

10. The positionally addressable polynucleotide array of claim 4, 7, 8, or 9,
wherein said plurality of polynucleotides is at least 500 different polynucleotides, at least 
1,000 different polynucleotides, at least 5,000 different polynucleotides, at least 10,000 
different polynucleotides, or at least 20,000 different polynucleotides. 

11. The positionally addressable polynucleotide array of claim 4, 7, 8, or 9,
wherein each said ACE is further characterized as having one or more of the following 
characteristics: 

(1) an intrinsic ability to confer hypersensitivity to the DNA
modifying agent when excised from its native location and 
inserted into at least one different location in the genome of 
a cell of the same cell type; 

(2) a greater hypersensitivity to the DNA modifying agent 
relative to a nearby region, wherein said hypersensitivity is 
10-50 times greater hypersensitivity, 50-100 times greater 
hypersensitivity, 100-150 times greater hypersensitivity or 
150-200 times greater hypersensitivity to the DNA modifying 
agent relative to the nearby region; 

(3) the ability to reconstitute a site that is hypersensitive to the
DNA modifying agent when a nucleic acid comprising the 
nucleotide sequence flanked by at least 1000 bp on each 
side is assembled into chromatin in an in vitro reconstitution 
assay in the presence of nucleosomal proteins and a cell 
extract; 

(4) is non-nucleosomal when present in chromatin isolated from
one or more cells; 

(5) is embedded in DNA associated with histones that have a 
high degree of acetylation when present in chromatin 
isolated from one or more cells; 

(6) greater solubility than nucleosomal material in moderate salt 
solutions (e.g., 150 mM NaCl and 3 mM MgCl2) when present
in chromatin isolated from one or more cells; 

(7) is a non-coding sequence; or 

(8) does not occur greater than 10 times in a genome of the
organism in which the ACE is identified. 

12. A method for profiling chromatin sensitivity of a genomic region of cells of a cell type 
to digestion by a DNA modifying agent, comprising determining a chromatin
sensitivity profile, said chromatin sensitivity profile comprising a plurality of replicate 
measurements of each of a plurality of different genomic sequences in said genomic 
region, wherein each of said plurality of replicate measurements is a ratio of (i) the 
intensity of signal of a test probe made from a treated cell type following 
hybridization to a microarray and (ii) the intensity of hybridization of a reference
probe of said cell type that has not been treated with said DNA modifying agent. 
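
The structure of the chromatin sensitivity profile in claim 12 can be sketched in Python as follows. The sketch assumes that intensities are held in dictionaries keyed by genomic sequence identifier, with one value per replicate hybridization; the function name `sensitivity_profile` and the argument names are hypothetical.

```python
def sensitivity_profile(test, reference):
    """Return, for each genomic sequence, its replicate test/reference intensity ratios.

    test and reference map a sequence identifier to a list of hybridization
    intensities, one per replicate, in matching order (an assumed layout).
    """
    profile = {}
    for seq_id, test_values in test.items():
        ref_values = reference[seq_id]
        if len(test_values) != len(ref_values):
            raise ValueError("replicate counts differ for " + str(seq_id))
        profile[seq_id] = [t / r for t, r in zip(test_values, ref_values)]
    return profile
```

Each list in the returned dictionary holds the plurality of replicate measurements for one genomic sequence.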

13. The method of claim 12, wherein said plurality of different genomic sequences 

comprises successively overlapping sequences tiled across one or more portions
of said genomic region. 

14. The method of claim 13, wherein said plurality of different genomic sequences

comprises successively overlapping sequences tiled across said genomic region. 
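
One way to picture successively overlapping sequences tiled across a region is the following Python sketch. The half-open coordinate convention, the fixed tile length and step, and the function name `tile` are assumptions made for illustration; the claims do not prescribe a tiling scheme.

```python
def tile(region_start, region_end, length, step):
    """Return (start, end) half-open intervals that tile a region with overlap."""
    if step >= length:
        raise ValueError("step must be smaller than length for tiles to overlap")
    tiles = []
    start = region_start
    while start + length < region_end:
        tiles.append((start, start + length))
        start += step
    # Final tile is placed flush with the end of the region.
    tiles.append((max(region_start, region_end - length), region_end))
    return tiles
```

For example, `tile(0, 100, 40, 20)` gives four 40-base tiles, each sharing 20 bases with its neighbour.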

15. The method of claim 12, wherein each of said plurality of different genomic 

sequences has a length in the range of about 75 to about 300 bases. 

16. The method of claim 15, wherein said plurality of different genomic sequences 

comprises successively overlapping sequences tiled across said genomic region. 

17. The method of claim 12, wherein each of said plurality of different genomic 

sequences has a length in the range of about 25 to about 80 bases. 

18. The method of claim 17, wherein the mean length of said plurality of different 

genomic sequences is about 40 bases. 

19. The method of claim 12, wherein said plurality of duplicate measurements consists 

of at least 3 duplicate measurements. 

20. The method of claim 19, wherein said plurality of duplicate measurements consists 

of at least 6 duplicate measurements. 

21. The method of claim 20, wherein said plurality of duplicate measurements consists 

of at least 9 duplicate measurements. 

22. The method of claim 12, further comprising determining a baseline chromatin 

sensitivity profile by a method comprising 

(a) smoothing the data in said chromatin sensitivity profile to obtain a baseline curve; 

and 

(b) determining the error bounds for said baseline curve,

wherein said baseline curve and said error bounds constitute said baseline chromatin
profile. 



23. The method of claim 22, wherein said smoothing is carried out using LOWESS.
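
As an illustration of the smoothing step, the sketch below implements a basic locally weighted linear regression in pure Python. It is a simplified stand-in for LOWESS: it uses tricube weights and a nearest-neighbour bandwidth but omits the robustness iterations of the full procedure, and the span parameter `frac` and the function names are assumptions, not values taken from the claims.

```python
def tricube(u):
    """Tricube kernel: positive weight inside |u| < 1, zero outside."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def lowess(x, y, frac=0.5):
    """Smooth y over x with locally weighted linear fits (no robustness passes)."""
    n = len(x)
    k = min(n, max(2, int(round(frac * n))))  # points used by each local fit
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = sorted(abs(xi - x0) for xi in x)[k - 1]
        if h == 0:
            h = 1.0  # degenerate case (repeated x values); avoids division by zero
        sw = swd = swy = swdd = swdy = 0.0
        for xi, yi in zip(x, y):
            w = tricube((xi - x0) / h)
            d = xi - x0  # centre on x0 so the intercept is the fitted value
            sw += w
            swd += w * d
            swy += w * yi
            swdd += w * d * d
            swdy += w * d * yi
        denom = sw * swdd - swd * swd
        if abs(denom) < 1e-12:
            fitted.append(swy / sw)  # fall back to a weighted mean
        else:
            slope = (sw * swdy - swd * swy) / denom
            fitted.append((swy - slope * swd) / sw)
    return fitted
```

Here `x` would hold genomic positions and `y` the corresponding profile values; the returned list plays the role of the baseline curve.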



24. The method of claim 22, wherein said error bounds are determined by a method
comprising 

(b1) mean centering said plurality of replicates for each genomic sequence in said 
chromatin sensitivity profile about said baseline curve to generate a
mean-centered chromatin sensitivity profile, wherein said mean-centering is carried out
by setting the mean of each said plurality of replicates to the value of the 
corresponding genomic sequence on said baseline curve; 

(b2) determining the median M of said mean-centered chromatin sensitivity profile;

(b3) determining the Median Average Deviation MAD of said mean-centered chromatin
sensitivity profile; 

(b4) discarding for each genomic sequence replicate measurement X if X satisfies
equation

$$\frac{|X - M|}{\mathrm{MAD}/0.6745} > 2.24,$$
and

(b5) defining the error bounds as the lower and upper confidence limits on the 
remaining data. 
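
The discarding rule of step (b4) can be sketched in Python as follows. The sketch assumes the constants 0.6745 and 2.24 (as they appear in claim 44), reads "Median Average Deviation" as the median absolute deviation about the median, and takes a flat list of already mean-centered values; the function name `mad_filter` is hypothetical.

```python
from statistics import median


def mad_filter(values, cutoff=2.24, scale=0.6745):
    """Keep values X for which |X - M| / (MAD / scale) does not exceed cutoff."""
    m = median(values)
    mad = median(abs(v - m) for v in values)
    if mad == 0:
        return list(values)  # no spread, so no value can be flagged
    return [v for v in values if abs(v - m) / (mad / scale) <= cutoff]
```

Applied to `[1.0, 1.1, 0.9, 1.05, 0.95, 5.0]`, the value 5.0 is discarded and the remaining five are kept.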



25. The method of claim 22, wherein said error bounds are determined by a method
comprising 

(b1) generating a bootstrap chromatin sensitivity profile by randomly selecting one 
replicate measurement from said plurality of replicate measurements for each 
genomic sequence; 

(b2) mean centering said plurality of replicates for each genomic sequence in said 
bootstrap chromatin sensitivity profile about said baseline curve to generate a 
mean-centered chromatin sensitivity profile, wherein said mean-centering is
carried out by setting the mean of each said plurality of replicates to the value of
the corresponding genomic sequence on said baseline curve;

(b3) determining the median M of said mean-centered chromatin sensitivity profile;

(b4) determining the Median Average Deviation MAD of said mean-centered chromatin
sensitivity profile; 

(b5) discarding for each genomic sequence replicate measurement X if X satisfies
equation

$$\frac{|X - M|}{\mathrm{MAD}/0.6745} > 2.24;$$

(b5) determining the maximum lower and minimum upper outliers on the remaining
data; 

(b6) repeating said steps (b1)-(b5) a plurality of times; and

(b7) calculating the upper and lower outlier cutoff values and BCa confidence intervals.

26. The method of claim 24, further comprising

(c1) identifying one or more genomic sequences among said plurality of genomic 
sequences whose 20% trimmed means lie outside said error bounds; and

(c2) determining a signal-to-noise ratio S/N of said identified genomic sequences 
according to equation 

$S/N_i =$ [remainder of equation illegible in source]

where $S/N_i$ is the signal-to-noise ratio at site $i$, $HS_i$ is the Y% trimmed mean of the
corresponding HS cluster, $B_i$ is the value of said baseline curve at said site $i$,
$\mathrm{MAD}_B$ is the median average deviation of the centered baseline, $\sigma_{HS}$ is the
average variance of replicate measurements, and [symbol illegible in source] is the variance of the
replicate measurements at said site $i$.

27. The method of claim 25, further comprising

(c1) identifying one or more genomic sequences among said plurality of genomic 
sequences whose 20% trimmed means lie outside said error bounds; and 

(c2) determining a signal-to-noise ratio S/N of said identified genomic sequences
according to equation

where $S/N_i$ is the signal-to-noise ratio at site $i$, $HS_i$ is the Y% trimmed mean of the
corresponding HS cluster, $B_i$ is the value of said baseline curve at said site $i$,
$\mathrm{MAD}_B$ is the median average deviation of the centered baseline, $\sigma_{HS}$ is the
average variance of replicate measurements, and [symbol illegible in source] is the variance of the
replicate measurements at said site $i$.

28. The method of any one of claims 12-27, wherein each said copy number has been 

corrected for amplification efficiency.

29. The method of any one of claims 12-27, wherein said DNA modifying agent is 

DNase I. 

30. The method of any one of claims 12-27, wherein each of said plurality of duplicated 

measurements is measured by independent microarray hybridization 
experiments. 

31. The method of any one of claims 12-27, wherein each of said plurality of duplicated 

measurements is measured by independent microarray hybridization 
experiments using different treated chromatin samples. 

32. A method for profiling chromatin sensitivity of a genomic region of cells of a cell type 

to digestion by a DNA modifying agent, comprising 

(a) treating chromatin of cells of said cell type with said DNA modifying agent such that 

digestion of DNA occurs and retrieving DNA molecules; 

(b) creating Test probes by various methods from chromatin of cells of said type treated 

by said DNA modifying agent; 

(c) creating Reference probes by various methods from chromatin of cells of said type

untreated by said DNA modifying agent; 

(d) determining a ratio of intensity of hybridization signal of probes described in step (b) 

and step (c) following hybridization to a microarray;

(e) repeating said steps (b) - (d) a plurality of times to generate a plurality of ratios, 

thereby generating a plurality of replicate measurements for each of said 
genomic sequences; and 
(f) determining a chromatin sensitivity profile of said genomic region, said chromatin
sensitivity profile comprising said plurality of replicate measurements. 

33. The method of claim 32, wherein said plurality of different genomic sequences 

comprises successively overlapping sequences tiled across one or more portions
of said genomic region. 

34. The method of claim 33, wherein said plurality of different genomic sequences 

comprises successively overlapping sequences tiled across said genomic region.

35. The method of claim 32, wherein each of said plurality of different genomic 

sequences has a length in the range of about 75 to about 300 bases. 

36. The method of claim 35, wherein the mean length of said plurality of different 

genomic sequences is about 250 bases. 

37. The method of claim 32, wherein each of said plurality of different genomic 

sequences has a length in the range of about 25 to about 80 bases. 

38. The method of claim 37, wherein the mean length of said plurality of different 

genomic sequences is about 40 bases. 

39. The method of claim 32, wherein said plurality of duplicate measurements consists

of at least 3 duplicate measurements. 

40. The method of claim 39, wherein said plurality of duplicate measurements consists 

of at least 6 duplicate measurements. 

41. The method of claim 40, wherein said plurality of duplicate measurements consists

of at least 9 duplicate measurements. 

42. The method of claim 32, further comprising determining a baseline chromatin

sensitivity profile by a method comprising 

(a) smoothing the data in said chromatin sensitivity profile to obtain a baseline curve; 

and 

(b) determining the error bounds for said baseline curve,

wherein said baseline curve and said error bounds constitute said baseline chromatin
profile. 

43. The method of claim 42, wherein said smoothing is carried out using LOWESS.

44. The method of claim 42, wherein said error bounds are determined by a method

comprising 

(b1) mean centering said plurality of replicates for each genomic sequence in said 
chromatin sensitivity profile about said baseline curve to generate a mean- 
centered chromatin sensitivity profile, wherein said mean-centering is carried out 
by setting the mean of each said plurality of replicates to the value of the 
corresponding genomic sequence on said baseline curve; 

(b2) determining the median M of said mean-centered chromatin sensitivity profile; 

(b3) determining the Median Average Deviation MAD of said mean-centered chromatin
sensitivity profile; 

(b4) discarding for each genomic sequence replicate measurement X if X satisfies
equation

$$\frac{|X - M|}{\mathrm{MAD}/0.6745} > 2.24,$$
and

(b5) defining the error bounds as the lower and upper confidence limits on the
remaining data. 

45. The method of claim 42, wherein said error bounds are determined by a method

comprising 

(b1) generating a bootstrap chromatin sensitivity profile by randomly selecting one 
replicate measurement from said plurality of replicate measurements for each 
genomic sequence; 

(b2) mean centering said plurality of replicates for each genomic sequence in said 
bootstrap chromatin sensitivity profile about said baseline curve to generate a 
mean-centered chromatin sensitivity profile, wherein said mean-centering is 
carried out by setting the mean of each said plurality of replicates to the value of 
the corresponding genomic sequence on said baseline curve; 

(b3) determining the median M of said mean-centered chromatin sensitivity profile;

(b4) determining the Median Average Deviation MAD of said mean-centered chromatin
sensitivity profile; 

(b5) discarding for each genomic sequence replicate measurement X if X satisfies
equation

$$\frac{|X - M|}{\mathrm{MAD}/0.6745} > 2.24;$$

(b5) determining the maximum lower and minimum upper outliers on the remaining
data; 

(b6) repeating said steps (b1)-(b5) a plurality of times; and

(b7) calculating the upper and lower outlier cutoff values and BCa confidence intervals.

46. The method of claim 44, further comprising

(c1) identifying one or more genomic sequences among said plurality of genomic
sequences whose 20% trimmed means lie outside said error bounds; and

(c2) determining a signal-to-noise ratio S/N of said identified genomic sequences 
according to equation 



$|HS_i - B_i|$ [remainder of equation illegible in source]

where $S/N_i$ is the signal-to-noise ratio at site $i$, $HS_i$ is the Y% trimmed mean of the
corresponding HS cluster, $B_i$ is the value of said baseline curve at said site $i$,
$\mathrm{MAD}_B$ is the median average deviation of the centered baseline, $\sigma_{HS}$ is the
average variance of replicate measurements, and [symbol illegible in source] is the variance of the
replicate measurements at said site $i$.

47. The method of claim 44, further comprising

(c1) identifying one or more genomic sequences among said plurality of genomic 
sequences whose 20% trimmed means lie outside said error bounds; and 

(c2) determining a signal-to-noise ratio S/N of said identified genomic sequences 
according to equation 

where $S/N_i$ is the signal-to-noise ratio at site $i$, $HS_i$ is the Y% trimmed mean of the
corresponding HS cluster, $B_i$ is the value of said baseline curve at said site $i$,
$\mathrm{MAD}_B$ is the median average deviation of the centered baseline, $\sigma_{HS}$ is the
average variance of replicate measurements, and [symbol illegible in source] is the variance of the
replicate measurements at said site $i$.

48. The method of any one of claims 32-47, wherein each said hybridization intensity 

has been normalised.

49. The method of any one of claims 32-48, wherein said DNA modifying agent is

DNase I. 

50. The method of any one of claims 32-47, wherein each of said plurality of duplicated 

measurements is measured by independent microarray hybridization
experiments. 

51. The method of any one of claims 32-47, wherein each of said plurality of duplicated 

measurements is measured by independent microarray hybridization
experiments using different treated chromatin samples. 

52. The method of any one of claims 26-27 and 46-47, wherein said Y% trimmed mean 

is 20% trimmed mean. 
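
A Y% trimmed mean can be computed as in the following Python sketch. It assumes the common convention of discarding Y% of the values from each tail before averaging; the claims do not state whether trimming is per tail or in total, and the function name `trimmed_mean` is hypothetical.

```python
def trimmed_mean(values, percent=20.0):
    """Mean of the values left after dropping `percent` % from each end."""
    ordered = sorted(values)
    cut = int(len(ordered) * percent / 100.0)  # values removed per tail
    kept = ordered[cut:len(ordered) - cut]
    if not kept:
        raise ValueError("trimming removed every value")
    return sum(kept) / len(kept)
```

With five replicate values `[1, 2, 3, 4, 100]` and a 20% trim, the lowest and highest values are dropped and the result is 3.0, which shows how the trimmed mean damps the effect of a single extreme replicate.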